Abstract:
Advances in multimedia data acquisition and storage technology have led to the growth of very large multimedia databases. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging problem. This challenge has opened the opportunity for research in Multimedia Data Mining (MDM). Multimedia data mining can be defined as the process of finding interesting patterns from media data such as audio, video, image and text that are not ordinarily accessible by basic queries and associated results. The motivation for doing MDM is to use the discovered patterns to improve decision making. MDM has therefore attracted significant research efforts in developing methods and tools to organize, manage, search and perform domain-specific tasks for data from domains such as surveillance, meetings, broadcast news, sports, archives, movies, medical data, as well as personal and online media collections. This paper presents a survey on the problems and solutions in Multimedia Data Mining, approached from the following angles: feature extraction, transformation and representation techniques, data mining techniques, and current multimedia data mining systems in various application domains. We discuss the main aspects of feature extraction, transformation and representation techniques. These aspects are: level of feature extraction, feature fusion, feature synchronization, feature correlation discovery and accurate representation of multimedia data. A comparison of MDM techniques with state-of-the-art video processing, audio processing and image processing techniques is also provided. Similarly, we compare MDM techniques with state-of-the-art data mining techniques involving clustering, classification, sequence pattern mining, association rule mining and visualization. We review current multimedia data mining systems in detail, grouping them according to problem formulations and approaches. The review includes supervised and unsupervised discovery of events and actions from one or more continuous sequences. We also present a detailed analysis of what has been achieved and of the remaining gaps where future research efforts could be focused. We then conclude this survey with a look at open research directions.
Abstract:
With advances in computing techniques, a large amount of high-resolution, high-quality multimedia data (video, audio, and so on) has been collected in research laboratories in various scientific disciplines, particularly in cognitive and behavioral studies. How to automatically and effectively discover new knowledge from rich multimedia data poses a compelling challenge because most state-of-the-art data mining techniques can only search and extract pre-defined patterns or knowledge from complex heterogeneous data. In light of this challenge, we propose a hybrid approach that allows scientists to use data mining as a first pass, and then forms a closed loop of visual analysis of current results followed by more data mining work inspired by visualization, the results of which can be in turn visualized and lead to the next round of visual exploration and analysis. In this way, new insights and hypotheses gleaned from the raw data and the current level of analysis can contribute to further analysis. As a first step toward this goal, we implement a visualization system with three critical components: (1) a smooth interface between visualization and data mining; (2) a flexible tool to explore and query temporal data derived from raw multimedia data; and (3) a seamless interface between raw multimedia data and derived data. We have developed various ways to visualize both temporal correlations and statistics of multiple derived variables as well as conditional and high-order statistics. Our visualization tool allows users to explore, compare and analyze multi-stream derived variables and simultaneously switch to access raw multimedia data.
Abstract:
Over the past decades, data mining has proved to be a successful approach for extracting hidden knowledge from huge collections of structured digital data stored in databases. From its inception, data mining was done primarily on numerical data sets. Nowadays, large multimedia data sets such as audio, speech, text, web, image, video and combinations of several types are becoming increasingly available. These data are mostly unstructured or semi-structured by nature, which makes it difficult for human beings to extract information without powerful tools. This drives the need to develop data mining techniques that can work on all kinds of data such as documents, images, and signals. This paper surveys the current state of multimedia data mining and knowledge discovery, data mining efforts aimed at multimedia data, and current approaches and well-known techniques for mining multimedia data.
Abstract:
Temporal data mining remains an important research topic because many application areas need knowledge from temporal data, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. Although there are many studies on temporal data mining, they do not deal with discovering knowledge from temporal interval data such as patient histories, purchaser histories, and web logs. We propose a new temporal data mining technique that can extract temporal interval relation rules from temporal interval data by using Allen's theory. It consists of a preprocessing algorithm designed for the generalization of temporal interval data and a temporal relation algorithm for mining temporal relation rules from the generalized temporal interval data. This technique can provide more useful knowledge in comparison with conventional data mining techniques.
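As an illustration of the interval relations that Allen's theory defines, the following minimal sketch classifies the relation between two intervals. It is not the preprocessing or rule-mining algorithm proposed in the abstract; the function name `allen_relation`, the `(start, end)` tuple representation, and the relation labels are assumptions chosen for this example.

```python
from typing import Tuple

# Hypothetical representation: an interval is (start, end) with start < end.
Interval = Tuple[int, int]


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen interval relation that holds from interval a to interval b."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    if a_start < b_start < a_end < b_end:
        return "overlaps"
    # Only the cases where a starts after b starts and ends after b ends remain.
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    return "overlapped-by"


# Example: two made-up interval records, e.g. two treatment periods of a patient.
print(allen_relation((1, 5), (3, 8)))
```

A rule miner over interval data could count how often each relation label occurs between pairs of event types, but that step depends on details the abstract does not give.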
Abstract:
Because computer-based information systems have been in everyday use for several decades, many organizations have built up computerized archives of their past activities. These records are often valuable because past patterns of behavior can be used to predict future behavior. Historical databases, also known as temporal databases, support the efficient storage and querying of such information. In practice, media information carries time attributes, either implicitly or explicitly; such information is known as temporal data. A temporal database that has time as a mandatory field is considered to make the system more practical and realistic. The levels of data in the temporal database are optimized on a time basis by encoding the temporal database for efficient memory utilization. The idea is to perform temporal data mining on multimedia files in order to classify them according to their prominence from the user's perspective. Attributes of multimedia data, such as resolution and the timestamps of creation and modification, are analyzed based on their variation during optimization and then reviewed.
Abstract:
Background: As multimedia information databases grow ever larger, as described in various patents, big data technology is becoming more mature and the problem of massive multimedia data mining is becoming more important. Method: To address these problems, the status of multimedia data mining is analyzed, along with the general system structure used when applying multimedia information mining technology. The methods and principles of text mining, image mining, video mining, and web data mining in multimedia information data mining are analyzed in detail, and their advantages and disadvantages are contrasted. Experiments are designed to verify the strengths and weaknesses of the different methods. Result: In the improved algorithm, the mining result for multimedia text data based on correlation analysis is very different from the actual multimedia text data, while the result is closer to the original image mining result, so this method is more practical. Conclusion: By summarizing and discussing the problems in current methods, the challenges facing multimedia information data mining technology are pointed out, laying the foundation for further development of the related technology.
Abstract:
Objective: In this paper, we focus on the issue of providing physicians with the capability of representing in a seamless way both temporal aspects of multimedia semistructured data and their temporal presentation requirements. Background: Semistructured data are data with some structure, which may be irregular or incomplete and does not necessarily conform to a fixed schema. Semistructured data often contain the description of histories of the considered real world. The Extensible Markup Language (XML) is becoming a cross-compatible and standardized means for representing semistructured clinical data. In the field of medical informatics, there are many ongoing activities concerning XML. In the field of multimedia database systems, topics related to the integration of several media objects (with their temporal aspects) have been considered both for data modeling and querying issues and for modeling multimedia presentations. Methodology: We first propose the Multimedia Temporal Graphical Model (MTGM) by representing a clinical database for cardiology patients undergoing cardiac angiographies, and then describe it in a formal way. We deal with the problem of expressing MTGM data by XML and of managing MTGM clinical data through an XML-based system. We provide both a technique for translating (a part of) an MTGM database into an XML document and some techniques allowing us to obtain presentations defined by means of the Synchronized Multimedia Integration Language (SMIL) from MTGM presentations. Results: MTGM allows one to represent and store clinical information in a semistructured, temporal, and multimedia database. The physician can define multimedia presentations based on the stored data. Multimedia presentations are then stored in the same MTGM database together with temporal clinical information and are thus represented according to the same data model. A prototype based on an XML native database system has been designed and implemented.
Discussion and conclusions: In this work we have considered the theoretical and methodological issues concerning the definition of a general data model for describing temporal and multimedia features of semistructured clinical information. Other research and application-oriented features, which have not been considered in MTGM, could be investigated to complete MTGM with regard to its applicability to clinical domains: MTGM does not allow one to express times at different levels of granularity, i.e. with different time units, or with indeterminacy; besides the considered valid time, it could also be interesting to manage other temporal dimensions such as the transaction and availability times. Besides being useful for managing multimedia data stored according to widely accepted standards such as MPEG and DICOM, semistructured data, and XML in particular, are nowadays becoming the most important way of expressing and exchanging medical knowledge and data: MTGM can be considered as a data model allowing the seamless representation of both (multimedia and temporal) clinical data and knowledge.
Abstract:
In recent years, applications that integrate multimedia devices with wireless sensor networks have driven the evolution of wireless sensor networks into wireless multimedia sensor networks (WMSNs). Applications in WMSNs have to focus on both energy saving and application-level quality of service (QoS). Due to the characteristics of WMSNs, such as resource constraints and variable channel capacity, efficiently achieving application-level QoS in WMSNs is a challenging task. To overcome this challenge, in this paper we propose a new kind of pattern named the temporal region requesting pattern (TRRP) and a novel algorithm named TRRP-Mine for mining TRRPs efficiently. We also design a temporal region requesting cost function for cache replacement, abbreviated as TRRC, for cooperative caching of multimedia content in WMSNs. Empirical evaluations under various simulation conditions show that the proposed method delivers excellent performance in terms of hit rate and the number of replacements.
Abstract:
In recent years, the pervasive use of social media has generated huge amounts of data that have started to attract considerable attention. Each social media source uses different data types, such as textual and visual. For example, Twitter is for short text messages, Flickr is for images and videos, and Facebook allows all of these data types. It is highly desirable to find patterns of social media users from such different data formats. With the use of data mining techniques, social media data opens up many opportunities for researchers. Despite its short history, social media mining has become a very active research area. This paper provides a comprehensive survey of recent research on social user mining. In particular, the survey focuses on two aspects: (1) social user mining based on data types, such as textual, visual, and both textual and visual information, and (2) social user mining based on mining techniques. In addition, we present our current research on social user mining as well as its future directions.
Abstract:
Effective and efficient mining of music structure patterns from music query data is one of the most interesting issues in multimedia data mining. In this paper, we introduce a new kind of pattern, called emerging melody structure (EMS), for knowledge discovery from music melody streams. EMSs are defined as music data items with melody strings whose support increases significantly from one sliding window to the next in streaming melody sequences. The discovered EMSs can be used to predict future trends for online music style recommendation, to personalize the priority of Web music download services, to help composers write new music, or to help service providers collect more similar music. Therefore, an efficient data mining approach, called MEMSA (Mining Emerging Melody Structure Algorithm), is proposed to discover all EMSs from streaming music query data over sliding windows. In the framework of MEMSA, a prefix tree-based data structure, called the EMS-tree (Emerging Melody Structure tree), is constructed to maintain temporal EMSs effectively. Experimental results show that MEMSA is an efficient algorithm for mining all EMSs from streaming melody sequences.
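The notion of a melody string whose support rises sharply between two sliding windows can be sketched as below. This is only a naive illustration and not MEMSA or the EMS-tree: the function names, the growth-ratio threshold `min_growth`, and the rule that a melody absent from the previous window counts as emerging are all assumptions made for this example, since the abstract does not state how "significantly" is quantified.

```python
from collections import Counter
from typing import Dict, List, Sequence


def support(window: Sequence[str]) -> Dict[str, float]:
    """Relative frequency of each melody string within one window."""
    if not window:
        return {}
    counts = Counter(window)
    return {melody: n / len(window) for melody, n in counts.items()}


def emerging_melodies(
    prev_window: Sequence[str],
    curr_window: Sequence[str],
    min_growth: float = 2.0,  # hypothetical threshold on the support ratio
) -> List[str]:
    """Melodies whose support grew by at least min_growth between windows."""
    prev = support(prev_window)
    curr = support(curr_window)
    result = []
    for melody, curr_sup in curr.items():
        prev_sup = prev.get(melody, 0.0)
        # Assumption: a melody unseen in the previous window counts as emerging.
        if prev_sup == 0.0 or curr_sup / prev_sup >= min_growth:
            result.append(melody)
    return sorted(result)


# Made-up melody strings standing in for queries in two consecutive windows.
previous = ["abc", "abc", "abd", "xyz"]
current = ["abd", "abd", "abd", "abc"]
print(emerging_melodies(previous, current))  # ['abd']
```

This version recomputes supports from scratch for every pair of windows; a prefix tree, as the abstract describes, would presumably let shared melody prefixes and window updates be handled incrementally.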